EPITOPES OF THE ENV PROTEIN OF THE HEPATITIS C VIRUS

This invention relates to epitopes of the env
protein of the hepatitis C virus.

More particularly, this invention relates to
peptides comprising epitopes of the hepatitis C virus
(HCV) localized in the viral envelope surface protein
(env), which are capable of reacting with antisera
and/or with monoclonal antibodies. It also relates
to the amino acid sequences of said epitopes, as well
as to the nucleotide sequences coding for them.

The documents cited with a numeral reference are 
listed at the end of this disclosure. 

HCV is believed to be responsible for the forms of
hepatitis classified as non-A/non-B (PT-NANB) (1).
The existence of an etiological agent for NANB
hepatitis has also been proved by Alter et al. (2). The
virus has been identified as an RNA virus of positive
polarity, and the genome, in the form of cDNA, has been
wholly cloned and sequenced. Analysis of the sequence
showed that it consists of about 10,000 ribonucleotides
and forms a single reading frame that potentially codes
for a single amino acid chain. The same organization is
also present in other viral families such as those of
Flavivirus and of Pestivirus; however, other structural
characteristics make it uncertain to set forth a
precise taxonomic position of HCV.

The cloning of a first portion of the genome has
been disclosed by Choo, Q.-L. et al. (3), and a
sequence has been disclosed in EP 88310922.5. The
regions identified correspond to the so-called
nonstructural regions which, in a way similar to that
of flavivirus, have been called NS1, NS2, NS3, NS4
and NS5.

More recently, structural regions, coding for
capsid and for surface proteins, have been cloned and
sequenced. Such sequences have been published by
Okamoto, H. et al. (4) and in the European Patent
application EP 90302866.0.

In order to identify immunological markers of HCV
infection, large amounts of viral antigens are needed.
However, unlike other hepatotropic viruses such as HBV
and HDV, HCV is present in the liver and in the blood
at very low concentration and, unlike the hepatitis A
virus (HAV), HCV cannot be grown. Therefore a good
natural source of viral antigens is not available.

Accordingly, the preparation of immunological
tests requires the availability of synthetic peptides
capable of mimicking the immunological activity of
viral antigens. To that aim, the identification of
specific protein portions, denominated epitopes,
capable of reacting with antibodies is necessary due to
the short length of synthetic peptides. Moreover, it is
well known that tests which employ just the epitope of
the protein are more sensitive and more accurate.
Up to the present time it has been impossible to
identify portions of the HCV env protein with antigenic
activity, capable of reacting with antibodies and,
therefore, the env protein or portions thereof have
never been employed for immunological tests.

It is well known that RNA viruses are
characterized by a high frequency of spontaneous
mutation. In the case of HCV, variable and
hypervariable domains have been identified in the
sequences corresponding to the surface proteins (5 and
EP 0419182 A1), possibly related to viral mechanisms
of escape from the immune response. Moreover, NANB
hepatitis becomes a chronic disease in about 50% of
patients. It is therefore very useful to identify
epitopes of surface proteins both for diagnostic and
for prognostic purposes.

The Authors of this invention have identified
variable regions with a high antigenic activity in the
amino acid sequence of the env protein, and they have
found that such regions correspond to epitopes of said
protein. The Authors have also identified some variants
of such regions by means of amplification of nucleic
acids from serum samples; among such regions, one is
coded by an HCV variant not disclosed before.

The endemic distribution of the different viral
variants of HCV makes it necessary to prepare
assays able to detect epitopes of the different
variants.

The Authors have synthesized such epitopes in
vitro for immunological assays on serum samples.

The availability of an anti-env marker with
serological characteristics such as those of the object
of this invention can lead to more specific tests,
which can be employed in particular for anti-HCV
screening of blood samples. Indeed, an analysis carried
out by Contreras et al. (6) employing only the test
based on the c100 protein gave rise to a remarkable
number of false positive results, with no precise
identification of the infected samples.

Finally, as tests which employ amplification
procedures such as the PCR (polymerase chain reaction)
are not exploitable for massive screenings, it is
useful to correlate the positive results obtained with
the assay realized by the Authors and the results
obtained with the PCR.

Accordingly, a specific object of this invention is
an amino acid sequence comprising an epitope of the env
protein of the HCV virus, preferably in the portion
from amino acid 209 to amino acid 259, according to
the numbering given in (3).

According to some preferred embodiments of this
invention, said sequences are included in the following
group of sequences: SEQ ID N1, SEQ ID N2 and SEQ ID N3;
preferably from amino acid 13 to amino acid 46 of
SEQ ID N1, more preferably from amino acid 21 to
amino acid 30 of SEQ ID N1; alternatively from
amino acid 13 to amino acid 46 of SEQ ID N2,
preferably from amino acid 21 to amino acid 30
of SEQ ID N2; alternatively from amino acid 13 to
amino acid 47 of SEQ ID N3, preferably from
amino acid 21 to amino acid 31 of SEQ ID N3.

A further object of the invention is a peptide
according to any of said amino acid sequences,
preferably of synthetic origin, more preferably cyclized
by means of reaction of two cysteine residues.

A further object of this invention is a
nucleotide sequence coding for an epitope of the env
protein, preferably comprised in one of the sequences
of the following group: SEQ ID N1, SEQ ID N2 and SEQ
ID N3; preferably comprising at least one fragment of
10 nucleotides of SEQ ID N3, more preferably the entire
sequence of SEQ ID N3.
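The relation between a coding nucleotide sequence and the residue ranges named above can be illustrated with a minimal Python sketch. It is not part of the disclosed method. The cDNA string is SEQ ID N1 as read from the sequence listing of this disclosure, with damaged characters restored on the assumption that they stand for G; the helper names `translate` and `residues` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# SEQ ID N1 (env 1 variant), 153 bases. Assumed reading of the listing:
# characters damaged in extraction are taken to be G.
SEQ_ID_N1 = (
    "AAC TCG AGC ATT GTG TAC GAG GCT GCC GAC "
    "GCC ATC CTG CAC ACT CCG GGG TGC GTC CCT "
    "TGC GTT CGC GAG GGT AAC GCC TCG AGG TGT "
    "TGG GTG GCG ATC ACC CCC ACG GTG GCC ACC "
    "AGG GAT GGC AAA CTC CCC ACA GCG CAC GTT "
    "CGA"
)


def translate(cdna: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    bases = cdna.replace(" ", "").upper()
    return "".join(
        CODON_TABLE[bases[i:i + 3]] for i in range(0, len(bases) - 2, 3)
    )


def residues(peptide: str, first: int, last: int) -> str:
    """Return residues first..last, numbered from 1, both ends included."""
    return peptide[first - 1:last]


env1 = translate(SEQ_ID_N1)
print(len(env1))               # number of residues encoded
print(residues(env1, 13, 46))  # preferred portion of SEQ ID N1
print(residues(env1, 21, 30))  # more preferred portion, between two Cys
```

Under the stated reading of the listing, residues 21 to 30 form the stretch delimited by two cysteines, which is the portion later cyclized in Example 2.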

This invention will now be disclosed in some
working examples of the same, with reference to the
following figures, wherein:



PCT/IT92/00081

WO 93/02103



- Figures 1A, 1B and 1C represent the hydrophilic
profiles of the env 1, env 2 and env 3 variants,
respectively.

EXAMPLE 1 Identification of 3 variants in the region of
the env protein and synthesis of the corresponding
peptides

An investigation carried out by means of nucleic
acid amplification procedures on serum samples (PCR,
7) allowed the identification of 3 main variants of the
env surface protein, said variants being called
respectively env 1, env 2 and env 3, and
comprising the sequences disclosed respectively as SEQ
ID N1, SEQ ID N2 and SEQ ID N3.

The variants env 1 and env 2 are comprised in the
viral variants known to those skilled in the art as
HCV A1 (American isolate) and HCV J1 (Japanese
isolate), respectively. The variant env 3 is coded by a
viral variant, denominated HCV 3, which is not included
in any HCV isolate disclosed before the present
invention. This variant differs mainly by the insertion
of a histidine residue into a region delimited by 2
cysteines, which modifies the hydrophilic profile of the
gene product (Figures 1A, 1B and 1C). Such modification
is of particular relevance for the analogy with
the transmembrane region of the HIV-1 surface protein
(8, 9).
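How a residue insertion shifts a hydrophilic profile can be sketched in Python as a sliding-window average. The sketch rests on assumptions not stated in the text: the Hopp-Woods hydrophilicity scale, a window of six residues, and the cysteine-delimited stretches as read from the sequence listing (residues 21 to 30 of SEQ ID N1, residues 21 to 31 of SEQ ID N3). The function name is hypothetical.

```python
# Hopp-Woods hydrophilicity values (assumed scale; the source names none).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(peptide: str, window: int = 6) -> list[float]:
    """Average hydrophilicity over every run of `window` residues."""
    values = [HOPP_WOODS[residue] for residue in peptide]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Cysteine-delimited stretches, as read from the sequence listing.
ENV1_LOOP = "CVREGNASRC"    # SEQ ID N1, residues 21-30
ENV3_LOOP = "CVREGDNHSRC"   # SEQ ID N3, residues 21-31 (extra His)

for name, loop in (("env 1", ENV1_LOOP), ("env 3", ENV3_LOOP)):
    profile = hydrophilicity_profile(loop)
    print(name, [round(value, 2) for value in profile])
```

The env 3 stretch is one residue longer, so its profile contains one more window; comparing the two lists position by position shows where the averages diverge.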
Oligopeptides comprising respectively the sequence
of the env protein from amino acid 13 to
amino acid 32 of SEQ ID N1 (the env 1 variant); the
sequence of the env protein from amino acid 13 to
amino acid 32 of SEQ ID N2 (the env 2 variant); and the
sequence of the env protein from amino acid 13 to
amino acid 33 of SEQ ID N3 (the env 3 variant) are
synthesized according to Merrifield's method (10),
employing as the solid phase a polyamide resin, "Pepsyn
K" polyamide-Kieselguhr (Milligen, Novato,
California), which had been previously functionalized
with ethylenediamine and with 4-(alpha-Fmoc-amino-
2',4'-dimethoxybenzyl)phenoxyacetic acid. The amino
acids employed for the synthesis are protected on the
side chains by tert-butyl groups and on the alpha-amino
position with the Fmoc group
(9-fluorenylmethyloxycarbonyl group). The guanidinium
group of arginine and the imidazole group of histidine
are protected with the
2,2,5,7,8-pentamethylchroman-6-sulfonyl and trityl
groups, respectively. The carboxy group of the
amino acids employed is activated by the formation of
an ester-type bond with the pentafluorophenyl group.
The synthesis is performed with the Milligen 9050
synthesizer (Novato, California) employing the
continuous flow method. The removal of protection and
the separation of the peptides from the resin are
carried out by treatment with trifluoroacetic acid. The
peptide sequence is checked with an automatic
microsequencer (Porton Instruments).
EXAMPLE 2 

Oligopeptides comprising respectively the sequence
of the env protein from amino acid 21 to
amino acid 30 of SEQ ID N1 (the env 1 variant); the
sequence of the env protein from amino acid 21 to
amino acid 30 of SEQ ID N2 (the env 2 variant); and the
sequence of the env protein from amino acid 21 to
amino acid 31 of SEQ ID N3 (the env 3 variant) are
synthesized according to Example 1.

The cyclization of a fraction of the peptides is
carried out in the following way: the peptide is
dissolved in water to a concentration of 0.1 mg/ml. The
pH value is adjusted to 7 with 1 M NH4OH. Potassium
ferricyanide (400 mg K3Fe(CN)6 in 200 ml of water) is
then added slowly to the solution until the yellow
colour persists. The disappearance of the free SH
groups is checked employing the method of Ellman (11).

Alternatively the peptide is dissolved at 0.2
mg/ml in distilled/deionized water (Milli-Q) and the pH
is adjusted to 8 using a 3 M NH4Cl solution. The
solution is stirred for four days and the loss
of the free sulphydryl groups is monitored using the
Ellman titration. Briefly, 24 mg of 5,5'-dithio-bis
(2-nitrobenzoic acid) is dissolved in 5 ml of phosphate
buffer pH ?. 20 µl of this solution is mixed with 1 ml
of the peptide solution and the absorbance is read at
412 nm. After four days 96% of the free sulphydryl
groups have disappeared.
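The reagent quantities and the 412 nm readings above lend themselves to simple arithmetic, sketched here in Python. The molar masses are standard literature values rather than figures from this disclosure, the function names are hypothetical, and the thiol calculation assumes that the blank-corrected absorbance at 412 nm is proportional to the free SH concentration.

```python
# Molar masses in g/mol: standard values, not taken from the source text.
K3FE_CN6_G_PER_MOL = 329.24   # potassium ferricyanide, K3Fe(CN)6
DTNB_G_PER_MOL = 396.35       # 5,5'-dithio-bis(2-nitrobenzoic acid)


def millimolar(mass_mg: float, volume_ml: float, molar_mass: float) -> float:
    """Concentration in mM of mass_mg dissolved in volume_ml."""
    # mg/ml equals g/l; dividing by g/mol gives mol/l; x1000 gives mmol/l.
    return mass_mg / volume_ml / molar_mass * 1000.0


def thiol_loss_percent(a412_start: float, a412_now: float) -> float:
    """Percentage of free SH groups lost, from blank-corrected A412."""
    if a412_start <= 0:
        raise ValueError("starting absorbance must be positive")
    return 100.0 * (1.0 - a412_now / a412_start)


# Quantities stated in the text.
print(round(millimolar(400, 200, K3FE_CN6_G_PER_MOL), 2))  # ferricyanide
print(round(millimolar(24, 5, DTNB_G_PER_MOL), 2))         # DTNB stock

# Hypothetical absorbance readings, chosen only to illustrate the formula.
print(round(thiol_loss_percent(0.50, 0.02), 1))
```

The two absorbance values in the last line are invented for illustration; only the reagent masses and volumes come from the text.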

EXAMPLE 3 Immunological assay

In order to determine the immunogenicity of the
linear and cyclized peptides described in EXAMPLE 2, an
ELISA assay is carried out.

The cyclic and linear peptides are dissolved in
50 mM carbonate buffer, pH 9.6, at a concentration of 5
µg/ml. 200 µl/well is dispensed into a microtitration
plate and incubated for 1 hr at 37°C. The
overcoating of the wells is performed by adding to the
emptied wells 300 µl of a solution containing 50 mM
Tris-HCl pH 7.4 and 0.2% bovine serum albumin (BSA,
Sigma, Fraction V). The plates are incubated for 2 hrs
at room temperature.

Finally 300 µl/well of a solution containing 10%
sucrose, 4% polyvinylpyrrolidone and 9% NaCl is added
and left for 1 hr at room temperature.

The ELISA assay is performed by dispensing 200
µl/well of sera, previously diluted using an HCV
negative serum as sample diluent. The samples are
incubated for 1 hr at 37°C. The plates are then washed
five times with a solution containing 0.05% Tween-20 and
0.1% BSA in 50 mM phosphate buffer pH 7.4 (washing
buffer) and incubated for 1 hr at 27°C with 200 µl of a
solution containing goat IgG anti-human IgG,
conjugated with horseradish peroxidase (HRP).

After five washings with washing buffer the
plates are incubated for 30 min with a
chromogen-substrate solution (tetramethylbenzidine and
3% hydrogen peroxide). The reaction is stopped with 1 N
sulphuric acid and the absorbance is read at 450 nm.

The serum utilized (21) belongs to the panel BBI
mixed HCV (Boston Biomedica Inc.). The control HCV
negative serum constantly gives values lower than 0.04.

The results are shown in the following Table 1.

Table 1

ELISA assay with serum 21 BBI (OD at 450 nm)

serum         env 1              env 2              env 3
dil.      cyclic  linear     cyclic  linear     cyclic  linear

1:20      2.150   0.141      1.648   0.114      1.777   0.132
1:40      1.028   0.093      0.841   0.095      0.980   0.089
1:80      0.615   0.078      0.512   0.071      0.546   0.076
1:160     0.243   0.074      0.221   0.064      0.150   0.070
1:320     0.098   0.061      0.093   0.051      0.090   0.054
1:640     0.061   0.060      0.048              0.054   0.056

The results show that the env 1, env 2 and env 3
peptides are able to react with anti-HCV sera. The
reactivity is greatly increased when such peptides are
made cyclic and therefore have a conformational
structure similar to the corresponding region of the
whole env protein. The reactivity decreases
progressively with serum dilution, thus indicating
that the reaction is specific.
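The two observations drawn from Table 1 can be checked numerically with a short Python sketch. The OD values are those of Table 1 as read here, with the unreadable env 2 linear value at 1:640 held as None; the helper names are hypothetical.

```python
from typing import Optional

DILUTIONS = [20, 40, 80, 160, 320, 640]   # serum diluted 1:20 ... 1:640

# OD at 450 nm from Table 1, in the order of DILUTIONS.
CYCLIC = {
    "env 1": [2.150, 1.028, 0.615, 0.243, 0.098, 0.061],
    "env 2": [1.648, 0.841, 0.512, 0.221, 0.093, 0.048],
    "env 3": [1.777, 0.980, 0.546, 0.150, 0.090, 0.054],
}
LINEAR = {
    "env 1": [0.141, 0.093, 0.078, 0.074, 0.061, 0.060],
    "env 2": [0.114, 0.095, 0.071, 0.064, 0.051, None],  # last cell unreadable
    "env 3": [0.132, 0.089, 0.076, 0.070, 0.054, 0.056],
}


def cyclic_to_linear(variant: str) -> list[Optional[float]]:
    """Ratio of cyclic to linear OD at each dilution (None if missing)."""
    return [
        None if linear is None else cyclic / linear
        for cyclic, linear in zip(CYCLIC[variant], LINEAR[variant])
    ]


def strictly_decreasing(values: list[float]) -> bool:
    """True when every reading is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


for variant in CYCLIC:
    ratios = cyclic_to_linear(variant)
    shown = [None if r is None else round(r, 1) for r in ratios]
    print(variant, "cyclic/linear:", shown,
          "decreasing:", strictly_decreasing(CYCLIC[variant]))
```

At the lowest dilution the cyclic peptides read more than ten times higher than the linear ones, and the cyclic readings fall at every dilution step for all three variants.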

This invention has been disclosed with specific
reference to some preferred embodiments of the same,
but it is to be understood that modifications and/or
changes can be introduced by those who are skilled in
the art without departing from the spirit and scope of
the invention for which a priority right is claimed.
LIST OF THE SEQUENCE CHARACTERISTICS

SEQ ID N1

SEQUENCE TYPE: Nucleotide with corresponding peptide
LENGTH OF THE SEQUENCE: 153 base pairs
CONFORMATION: single helix
TOPOLOGY: linear
MOLECULAR TYPE: cDNA from genomic RNA
HYPOTHETIC SEQUENCE: no
ANTI-SENSE: no
ORIGINAL SOURCE: HCV virus variant A1
EXPERIMENTAL SOURCE: genic library from viral isolate
CHARACTERISTICS: coding for a portion of env protein,
variant env 1
IDENTIFICATION METHOD: experimental
PROPERTY: coding sequence

AAC TCG AGC ATT GTG TAC GAG GCT GCC GAC   30
Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp
1               5                   10

GCC ATC CTG CAC ACT CCG GGG TGC GTC CCT   60
Ala Ile Leu His Thr Pro Gly Cys Val Pro
11              15                  20

TGC GTT CGC GAG GGT AAC GCC TCG AGG TGT   90
Cys Val Arg Glu Gly Asn Ala Ser Arg Cys
21              25                  30

TGG GTG GCG ATC ACC CCC ACG GTG GCC ACC   120
Trp Val Ala Ile Thr Pro Thr Val Ala Thr
31              35                  40

AGG GAT GGC AAA CTC CCC ACA GCG CAC GTT   150
Arg Asp Gly Lys Leu Pro Thr Ala His Val
41              45                  50

CGA
Arg
51
SEQ ID N2

SEQUENCE TYPE: Nucleotide with corresponding protein
LENGTH OF THE SEQUENCE: 153 base pairs
CONFORMATION: single helix
TOPOLOGY: linear
MOLECULAR TYPE: cDNA from genomic RNA
HYPOTHETIC SEQUENCE: no
ANTI-SENSE: no
ORIGINAL SOURCE: HCV virus variant J1
EXPERIMENTAL SOURCE: genic library from viral isolate
CHARACTERISTICS: coding for a portion of env protein,
env 2 variant
IDENTIFICATION METHOD: experimental
PROPERTY: coding sequence

AAC TCA AGC ATC GTG TAT GAG GCA GCA GAC   30
Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp
1               5                   10

TTG ATC ATG CAC ACC CCC GGG TGC GTG CCC   60
Leu Ile Met His Thr Pro Gly Cys Val Pro
11              15                  20

TGC GTT CGG GAG AAC AAC CTC TCC CGC TGC   90
Cys Val Arg Glu Asn Asn Leu Ser Arg Cys
21              25                  30

TGG GTA GCG CTC ACT CCC ACG CTT GCG GCN   120
Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
31              35                  40

AGG AAT GTC AGC GTC CCC ACA GCA ACA ATA   150
Arg Asn Val Ser Val Pro Thr Ala Thr Ile
41              45                  50

CGA
Arg
51
SEQ ID N3

SEQUENCE TYPE: Nucleotide with corresponding protein
LENGTH OF THE SEQUENCE: 156 base pairs
CONFORMATION: single helix
TOPOLOGY: linear
MOLECULAR TYPE: cDNA from genomic RNA
HYPOTHETIC SEQUENCE: no
ANTI-SENSE: no
ORIGINAL SOURCE: HCV virus variant 3
EXPERIMENTAL SOURCE: genic library from viral isolate
CHARACTERISTICS: coding for a portion of the env
protein, env 3 variant
IDENTIFICATION METHOD: experimental
PROPERTY: coding sequence

AAC TCA AGT ATT GTG TAT GAG GCA GCG GAC   30
Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp
1               5                   10

CTG ATC ATG CAC ACC CCC GGG TGC GTG CCC   60
Leu Ile Met His Thr Pro Gly Cys Val Pro
11              15                  20

TGC GTT CGG GAA GGA GAG AAC CAC TCC CGC   90
Cys Val Arg Glu Gly Asp Asn His Ser Arg
21              25                  30

TGC TGG GTA GCG CTC ACT CCC ACT CTC GCG   120
Cys Trp Val Ala Leu Thr Pro Thr Leu Ala
31              35                  40

GCC AGG AAT AGC AGC GTC CCC ACC ACG ACA   150
Ala Arg Asn Ser Ser Val Pro Thr Thr Thr
41              45                  50

... ...
Ile Arg
51
CLAIMS

1. An amino acid sequence characterized in that
it comprises an epitope of the protein env of the virus
HCV.

2. An amino acid sequence according to claim 1,
characterized in that it is comprised within the
portion from the amino acid 209 to the amino acid 259
according to the numbering of Choo, Q.-L. et al.,
Science (1989), 244:359-362 (3).

3. An amino acid sequence according to claim 2,
characterized in that it is comprised in the SEQ ID N1.

4. An amino acid sequence according to claim 3,
characterized in that it comprises the portion from the
amino acid 13 to the amino acid 46 of SEQ ID N1.

5. An amino acid sequence according to claim 3,
characterized in that it comprises the portion from the
amino acid 21 to the amino acid 30 of SEQ ID N1.

6. An amino acid sequence according to claim 2,
characterized in that it is comprised in the SEQ ID N2.

7. An amino acid sequence according to claim 6,
characterized in that it comprises the portion from the
amino acid 13 to the amino acid 46 of the SEQ ID N2.

8. An amino acid sequence according to claim 6,
characterized in that it comprises the portion from the
amino acid 21 to the amino acid 30 of SEQ ID N2.

9. An amino acid sequence according to claim 2,
characterized in that it is comprised within the SEQ ID
N3.

10. An amino acid sequence according to claim 9,
characterized in that it comprises the portion from the
amino acid 13 to the amino acid 47 of SEQ ID N3.

11. An amino acid sequence according to claim 9,
characterized in that it comprises the portion from the
amino acid 21 to the amino acid 31 of SEQ ID N3.

12. Peptides characterized in that they have the
amino acid sequence according to any one of the
preceding claims.

13. Peptides according to claim 12 characterized
in that they are synthetic peptides.

14. Peptides according to claim 12 or 13
characterized in that they have a conformational
structure able to increase the immunogenicity thereof.

15. Peptides according to claim 14 characterized
in that said conformational structure is achieved by
reacting two residues of cysteine and by cyclizing the
peptide.

16. A nucleotide sequence coding for an epitope 

of the env protein. 

17. A nucleotide sequence according to claim 16,
characterized in that it is comprised in the sequence
SEQ ID N1.

18. A nucleotide sequence according to claim 16,
characterized in that it is comprised in the sequence
SEQ ID N2.

19. A nucleotide sequence according to claim 16,
characterized in that it is comprised in the sequence
SEQ ID N3.

20. A nucleotide sequence according to claim 19, 
characterized in that it comprises at least one 
fragment of 10 nucleotides of the SEQ ID N3. 

21. A nucleotide sequence according to claim 16,
characterized in that it comprises the sequence SEQ ID
N3.
HYDROPHILIC PROFILE OF THE PROTEIN
SEQUENCE HCVENV2 CALCULATED ON THE BASIS
OF AN AVERAGE LENGTH OF 6 AMINO ACIDS

FIG. 1C